Abstract

An approximate program transformation is a trans- 
formation that can change the semantics of a program 
within a specified empirical error bound. Such trans- 
formations have wide applications: they can decrease 
computation time, power consumption, and memory 
usage, and can, in some cases, allow implementations 
of incomputable operations. Correctness proofs of ap- 
proximate program transformations are by definition 
quantitative. Unfortunately, unlike with standard pro- 
gram transformations, there is as of yet no modular 
way to prove correctness of an approximate transfor- 
mation itself. Error bounds must be proved for each 
transformed program individually, and must be re-proved
each time a program is modified or a different
set of approximations is applied.

In this paper, we give a semantics that enables 
quantitative reasoning about a large class of approx- 
imate program transformations in a local, compos- 
able way. Our semantics is based on a notion of 
distance between programs that defines what it means 
for an approximate transformation to be correct up 
to an error bound. The key insight is that distances 
between programs cannot in general be formulated 
in terms of metric spaces and real numbers. Instead, 
our semantics admits natural notions of distance for 
each type construct; for example, numbers are used 
as distances for numerical data, functions are used 
as distances for functional data, and polymorphic 
lambda-terms are used as distances for polymorphic 
data. We then show how our semantics applies to two 
example approximations: replacing reals with floating-point
numbers, and loop perforation.

1. Introduction 

Approximation is a fundamental concept in engineering
and computer science, including notions such
as floating-point numbers, lossy compression, and approximation
algorithms for NP-hard problems. Such
techniques are often used to trade off accuracy of the 
result for reduced resource usage, for resources such 
as computation time, power, and memory. In addition, 
some approximation techniques are also used to ensure 
computability. For example, true representations of 
real numbers (e.g., Q, lH]), require some operations, 
such as comparison, to be incomputable; floating-point 
comparison, in contrast, is efficiently decidable on 
modern computers. 

Recently, there has been a growing interest in 
language-based approximations, where approximate 
program transformations are performed by the pro- 
gramming language environment fST], fTT\, fT9l, flSl, 
[4], [3], [16]. Such approaches allow the user to give an
exact program as a specification, and then apply some 
set of transformations to this specification, yielding an 
approximate program. The goal is for approximations 
to be performed on behalf of the programmer, either 
fully automatically or with only high-level input from 
the user, while still maintaining a given error bound; 
i.e., the goals are automation and correctness. These 
goals can in turn increase programmer productivity 
while helping to remove programmer errors, where 
the latter can be especially important, for example, in 
safety-critical systems. 

This leads us to a fundamental question: what does it 
mean for an approximate transformation to be correct? 
A good answer to this question must surely be quan- 
titative, since approximate transformations should not 
change the output by too much; i.e., they must respect 
user-specified error bounds. Correctness should also be 
modular, meaning that, in settings where approximate 
transformations $T_1, \ldots, T_k$ are applied together to a
program $P$, it should be possible to reduce the proof
of correctness of $\{T_1, \ldots, T_k\}$ to individual proof
obligations for the $T_i$s. Current formal approaches to
approximate program transformations, however, do not 
permit such modular reasoning about approximations. 



Typically, they are tailored to specific forms of approximation,
for example, the use of floating-point
numbers or loop perforation [19] (skipping certain iterations
in long-running loops). Even when multiple approximations
can be combined, reasoning about them
is monolithic; in the above example, if $T_i$ is changed
slightly, we would need to re-prove the correctness of
$P$ with respect to not just $T_i$ but $\{T_1, \ldots, T_k\}$. In
addition, current approaches to approximate compu- 
tation f2], fJTl, fT^, fTSl, r4| are almost universally 
restricted to first-order programs over restricted data 
types. 

In this work, we improve on this state of the art by 
giving a general, composable semantics for program 
approximation, for higher-order programs with poly- 
morphic types. In our semantics, individual approx- 
imate transformations are proved to be quantifiably 
correct, i.e., to induce a given local error expression. 
Error expressions are then combined compositionally, 
yielding a top-level error expression for the whole pro- 
gram that is built up from the errors of the individual 
approximate transforms being used. This approach has 
a number of benefits. First, it allows for more portable 
proofs: an approximate transformation T can now be 
proved correct once, and the resulting error expression 
can be used many times in many different contexts. 
Second, it is mechanical and opens up opportunities for 
automation: approximation errors for a whole program 
and a set of disparate approximate transformations are 
generated simply by composing the errors of the indi- 
vidual transformations. Finally, our approach reduces 
the correctness of approximately transformed programs 
to the much easier question of whether a generated 
error expression is less than or equal to a given error 
bound. 

The key technical insight that makes our semantics 
possible is that, despite past work on using metric 
spaces and real numbers in program semantics (e.g., 
II20I . ifTSl . ifTTI ). we argue that real numbers cannot 
in general capture how, for example, the output error 
of a function depends on the input and its error. 
Instead, our approach allows arbitrary System F types 
for errors. We show that this can accurately capture 
errors, for example, by using functions as errors for 
functional data and polymorphic lambdas as errors for 
polymorphic data. To allow this, our semantics is based 
around a novel notion of an approximation type, which 
is a ternary logical relation ||6l, ifTTll between exact 
expressions e, approximate expressions a, and error 
expressions q. In addition to the above benefits, this 
approach can also handle changes of type between 
exact and approximate expressions, e.g., approximating 
real numbers by floating-point. 



The remainder of the paper is organized as follows.
Section 2 motivates our semantics with a high-level
overview. Section 3 defines our input language,
$F^{\mathrm{ADT}+}$, which is System F with algebraic datatypes
and built-in operations. Section 4 defines our semantics
by defining the notion of approximation types mentioned
above. Section 5 then shows how our semantics
of approximation types can be used to verify an
approximating compiler which compiles real numbers
into floating-point numbers and optionally performs
loop perforation [12], [19]. Note that the error bounds
created for the real to floating-point approximation
essentially yield a general approach to floating-point
error analysis [10] that works for higher-order
and even polymorphic programs. Finally, Section 6
discusses related work and Section 7 concludes.

2. Approximating Programs 

The goal of this work is to give a semantics for approximate
transformations, which convert an exact program
$e$ into an approximate program $a$ that, although
not identical to $e$, is within some quantifiable error
bound $q$. Our notion of approximate transformation is
very general, but it includes at least:

Data Approximations: $a$ uses a less exact datatype,
such as floating-point numbers instead of reals;
Approximations of Incomputable Operations: $a$
uses inexact but computable versions of potentially
incomputable operations, such as $f(n)$ for
finite $n$ in place of $\lim_{x \to \infty} f(x)$; and
Approximate Optimizations: $a$ performs a cheaper,
less precise version of a computation, such as the
identity function in place of $\sin(x)$.

The point of language-based approximations is that 
these transformations become part of the language 
semantics, which precisely captures the relationship 
between exact programs, their approximations, and the 
associated error bounds. 

As a simple yet illustrative example, consider an
approximate transformation that replaces the sin operator
with the identity function $\lambda x.\,x$. Such an approximation
can greatly reduce computation, since sin
(for floating-point numbers) can be a costly operation,
while the identity function is a no-op. Intuitively, we
know that this change does not greatly affect the output
when $x$ is close to 0. If, however, the output of a
call to sin is then passed to a numerically sensitive
operation, such as a reciprocal, then the small change
resulting from replacing sin by the identity could lead
to a large change in the final result. Thus, our goal is
to quantify the error introduced by this approximate
transformation in a local, compositional way, allowing
this error to be propagated through the rest of the
program.

Figure 1. Error of Approximating sin with $\lambda x.\,x$

Figure 1 allows us to visualize the error resulting
from replacing sin with the identity function. We
assume that the input $in$ has already been approximated,
yielding an approximate input $in'$ with some
approximation error $err$. The exact output, $\sin(in)$,
gets replaced by an approximate result equal to $in'$.
The error for this expression can then be calculated as
$err + in - \sin(in)$, as described in the figure.

This example leads to the following observations: 

1) Errors are programs: Although the standard
approach to quantification and errors is to use
metric spaces and real numbers, the error expression
$err + in - \sin(in)$ in our example depends
on both the input $in$ and the input error $err$; i.e.,
it is a function, which cannot be represented by
a single real number.

2) Errors are compositional: Approximation errors
in the input to sin are substituted into the
approximation error for sin itself, and this error
is in turn passed to any further approximations
that occur at a later point in the computation.
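
The two observations above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the names `sin_error`, `recip_error`, and `compose` are hypothetical and not part of the formalism, ordinary floats stand in for exact reals, and the absolute value $|in - \sin(in)|$ is used so that the bound also covers negative inputs.

```python
import math

def sin_error(x, err):
    """Error of replacing sin by the identity, as a function of the exact
    input x and the input error err. By the triangle inequality it bounds
    |x' - sin(x)| whenever |x' - x| <= err."""
    return err + abs(x - math.sin(x))

def recip_error(y, err):
    """Assumed helper: error of computing 1/y' in place of 1/y when
    |y' - y| <= err. Infinite when the approximate input may reach 0."""
    if err >= abs(y):
        return math.inf
    return err / (abs(y) * (abs(y) - err))

def compose(x, err):
    """Error of approximating 1/sin(x) by 1/x': the error of the inner
    approximation is substituted into the error of the outer operation."""
    return recip_error(math.sin(x), sin_error(x, err))
```

Calling `compose` near `x = 0` returns an infinite bound, which matches the remark that a reciprocal can amplify the small error introduced by replacing sin with the identity.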

In order to formalize the relationship between pro- 
grams, their approximations, and the resulting errors, 
we define a semantic notion below called an ap- 
proximation type. An approximation type is a form 
of logical relation (see, e.g., (6^, ifTTJI ). satisfying 
certain properties discussed below, that relates exact 
expressions e, approximate expressions a, and error 
expressions $q$. If $\mathcal{A}$ is an approximation type, we write
$\{|a|\}^q_{\mathcal{A}}$ for the set of all exact expressions $e$ related by
$\mathcal{A}$ to approximate expression $a$ and error expression
$q$. Thus, $e \in \{|a|\}^q_{\mathcal{A}}$ means that $e$ can be approximated
by $a$ with approximation error no greater than $q$ when
using approximation type $\mathcal{A}$.

As a first example, we define the approximation type
$\mathsf{Fl}$ that relates real number expressions $e$, floating-point
expressions $a$, and non-negative real number
expressions $q$ iff $q$ is at least the distance between $e$ and the
real number corresponding to $a$. For instance, the real
number $\pi$ is within error $0.1415926\ldots$ of the floating-point
number 3, meaning that $\pi \in \{|3|\}^{0.1415926\ldots}_{\mathsf{Fl}}$
holds. If the only difference between $a$ and $e$ is that
the former uses floating-point numbers in place of
reals, then showing $e \in \{|a|\}^q_{\mathsf{Fl}}$ is essentially a form of
floating-point error analysis. Using $\mathsf{Fl}$ is more general,
however, because it allows the possibility that $a$ performs
further approximations, e.g., using the identity
in place of sin. Taking this comparison further, $\mathsf{Fl}$ is
specifically like an interval-based error analysis; we
examine this more closely in Section 5.2.

For functions, following Figure 1, we use an approximation
type where the error of approximating a
function is itself a function that maps exact inputs
and their approximation errors to output approximation
errors. More formally, if $\mathcal{A}_1$ and $\mathcal{A}_2$ are approximation
types for the input and output of a function,
respectively, then the functional approximation type
$\mathcal{A}_1 \Rightarrow \mathcal{A}_2$ is defined such that $e \in \{|a|\}^q_{\mathcal{A}_1 \Rightarrow \mathcal{A}_2}$ iff
$e\,e_1 \in \{|a\,a_1|\}^{q\,e_1\,q_1}_{\mathcal{A}_2}$ for all $e_1$, $a_1$, and $q_1$ such that
$e_1 \in \{|a_1|\}^{q_1}_{\mathcal{A}_1}$.
These notions can then be used to prove correctness
of approximate transformations, as follows. First, the
designer of an approximate transformation gives an
approximation rule for the judgment $\vdash e \leadsto a \le q : \mathcal{A}$,
stating that expressions matching $e$ can be transformed
into $a$ with error at most $q$ using approximation type
$\mathcal{A}$. Approximation rules are explained in more detail
below, but, as an example, our sin approximation above
can be captured as:

$\vdash \sin \leadsto \lambda x.\,x \le \lambda x.\,\lambda q.\,q + x - \sin(x) : \mathsf{Fl} \Rightarrow \mathsf{Fl}$

After formulating this rule, it must then be proved
correct, by proving that $e \in \{|a|\}^q_{\mathcal{A}}$ for all $e$, $a$, and
$q$ that match the rule. As argued above, a significant
benefit of this approach is that it allows approximate
transformations to be analyzed and proved correct on
their own, without reference to the programs in which
they are being used.

3. System F as a Logical Language 

In the remainder of this paper, we work in System 
F with algebraic datatypes (ADTs) (see, e.g., Pierce
[13]), extended to allow uncountability and incomputability.
Specifically, ADTs can have uncountably
many constructors, while built-in operations can per- 
form potentially incomputable functions. The former is 
useful for modeling the real numbers, while the latter 
is useful for modeling operations on real numbers such 
as comparison that are incomputable in general (e.g., 
Q, CD- We refer to this language as i^ADT+ 



The only additional technical machinery needed for
these extensions to System F is to require that all built-in
functions are definable in the meta-language (set
theory or type theory). More specifically, we assume
as given some set of built-in operation symbols $f$,
each with a given type $\tau_f$ and meta-language relation
$R_f$ relating allowed inputs for the function defined by
$f$ to their corresponding outputs. Further, we assume
that each $R_f$ obeys $\tau_f$ and relates only one output
to any given set of inputs. The small-step evaluation
relation of System F is then extended to allow $f$
applied to any values to evaluate to any output related
to those inputs by $R_f$. The details are straightforward
but tedious, and so are omitted here.

We use $\longrightarrow$ to denote the small-step evaluation
relation of $F^{\mathrm{ADT}+}$; i.e., $e \longrightarrow e'$ means that $e$
evaluates in one step to $e'$. Typing contexts $\Gamma$ are
lists of type variables $X$ that are considered in scope,
along with pairs $x : \tau$ of expression variables $x$
along with their types. Substitutions of expressions
$e_1$ through $e_n$ for variables $x_1$ through $x_n$ are written
$[e_1/x_1, \ldots, e_n/x_n]$. These are represented with the
letter $\sigma$, and we define capture-avoiding substitution $\sigma e$
for expressions and $\sigma\tau$ for types in the usual manner.
Well-formedness of types and expressions is given
respectively by the kinding judgment $\Gamma \vdash \tau : *$ and
typing judgment $\Gamma \vdash e : \tau$, defined in the standard
manner. We write $[\![\Gamma \vdash \tau]\!]$ for the set of all expressions
$e$ such that $\Gamma \vdash e : \tau$. Expressions or types with no
free variables are ground. Typing is extended to typing
contexts $\Gamma$ and substitutions $\sigma$ in the usual manner: $\vdash \Gamma$
indicates that all types in $\Gamma$ are well-kinded, while
$\Gamma \vdash \sigma : \Gamma'$ indicates that $\mathrm{Dom}(\sigma) = \mathrm{Dom}(\Gamma')$,
$\Gamma \vdash \sigma(X) : *$ for all $X \in \mathrm{Dom}(\sigma)$, and $\Gamma \vdash \sigma(x) : \Gamma'(x)$
for all $x \in \mathrm{Dom}(\sigma)$. We assume all types are well-kinded
and all expressions, contexts, and substitutions
are well-typed below.

We write $e\downarrow$ to denote that $e$ is terminating. An
expression context $C$ is an expression with exactly one
occurrence of a "hole" $\{\}$, and $C\{e\}$ denotes the
(non-capture-avoiding) replacement of $\{\}$ with $e$. Contextual
equivalence $e_1 \cong e_2$ is then defined to hold iff, for all
$C$ and $\tau'$ such that $\cdot \vdash C\{e_i\} : \tau'$ for $i \in \{1, 2\}$, we
have that $C\{e_1\}\downarrow$ iff $C\{e_2\}\downarrow$. We use $\bot$ to denote an
arbitrary non-terminating expression, e.g., $\mathsf{fix}(\lambda x.\,x)$, of a
given type $\tau$. We assume below a type $R$ of real numbers,
represented with an uncountable number of constructors,
and the type $R^{+\infty}$ of the non-negative reals with
infinity. We use $0_R$ for the $R$-constructor corresponding
to the real number 0, and $\le_R$, $+_R$, and $d_R$ for the
function symbols whose functions perform real number
comparison, addition, and absolute difference, both on
$R$ and, abusing notation slightly, on $R^{+\infty}$.



Since $F^{\mathrm{ADT}+}$ can contain incomputable built-in
operations, we can additionally use it as a meta-logic
by embedding any relations of a given meta-language,
such as set theory or type theory, as built-in
operations. We assume an ADT $\mathsf{Prop}$ with the
sole constructor $\top :: \mathsf{Prop}$, which will intuitively be
used as the type of propositions; a "true" proposition
terminates to $\top$ and a "false" one is non-terminating,
i.e., is contextually equivalent to $\bot$. A meta-language
relation $R$ can then be added to $F^{\mathrm{ADT}+}$ as a function
symbol $f_R$ of type $\forall \vec{X}.\,\vec{\tau} \to \mathsf{Prop}$ iff $R$ is a set of
tuples $(\vec{\tau}', \vec{e})$ such that $\cdot \vdash f_R[\vec{\tau}']\,\vec{e} : \mathsf{Prop}$ that
is closed under contextual equivalence and contains
only terminating expressions $\vec{e}$. Note that this latter
restriction is not too significant because we can always
use a "thunkified" relation $R'$ of type
$\forall \vec{X}.\,(\mathsf{Unit} \to \tau_1) \to \ldots \to (\mathsf{Unit} \to \tau_n) \to \mathsf{Prop}$ such that
$(\vec{\tau}', \vec{e}) \in R$ iff $(\vec{\tau}', \lambda x.\,e_1, \ldots, \lambda x.\,e_n) \in R'$. As an
example, we can immediately see that the thunkified
version of contextual equivalence itself is a relation of
type $\forall X.\,(\mathsf{Unit} \to X) \to (\mathsf{Unit} \to X) \to \mathsf{Prop}$.

Lemma 1 (Consistency): For any class of built-in
operations $f$ with functions $F_f$, $\top \cong \bot$ does not hold.

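In the below, $\phi$ refers to expressions of type
$\forall \vec{X}.\,\vec{\tau} \to \mathsf{Prop}$. A constrained context is a pair $(\Gamma; \phi)$
of a context $\Gamma$ and an expression $\phi$ such that
$\Gamma \vdash \phi : \mathsf{Prop}$. A substitution $\sigma$ satisfies $(\Gamma; \phi)$, written
$\sigma \models (\Gamma; \phi)$, iff $\cdot \vdash \sigma : \Gamma$ and $\sigma(\phi) \cong \top$. We say
that $(\Gamma; \phi)$ entails $\phi'$, written $(\Gamma; \phi) \vdash \phi'\ \mathsf{true}$, iff
$\forall \sigma \models (\Gamma; \phi).\ \sigma(\phi') \cong \top$ holds. Finally, we say that $\sigma$
is a substitution from $(\Gamma_1; \phi_1)$ to $(\Gamma_2; \phi_2)$, written
$(\Gamma_2; \phi_2) \vdash \sigma : (\Gamma_1; \phi_1)$, iff $\Gamma_2 \vdash \sigma : \Gamma_1$ and
$(\Gamma_2; \phi_2) \vdash \sigma(\phi_1)\ \mathsf{true}$.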

4. A Semantics of Program Approximation 

In this section, we define approximation types and
illustrate them with examples. The main technical
difficulty is that the straightforward way to handle
free (expression and type) variables involves a circular
definition. Specifically, to fit with the logical relations
approach, $e \in \{|a|\}^q_{\mathcal{A}}$ for $e$, $a$, and $q$ with free variables
should only hold iff $\sigma e \in \{|\sigma a|\}^{\sigma q}_{\mathcal{A}}$ for all substitutions
$\sigma$ for exact variables $x^e$, approximate variables $x^a$,
and error variables $x^q$ such that $\sigma x^e \in \{|\sigma x^a|\}^{\sigma x^q}_{\mathcal{A}'}$ for
some approximation type $\mathcal{A}'$. This defines approximation
types in terms of approximation types! To remove
this circularity, we first define approximation types
that handle a given set of free variables arbitrarily,
in Section 4.1. We then show in Section 4.2 how
to lift ground approximation types into approximation
functions, which uniformly handle any given context
of variables in the "right" way.



4.1. Approximation Types 

Definition 1 (Expression-Preorder): Let $\Gamma \vdash \tau : *$.
A relation $\le$ is called a $(\Gamma \vdash \tau)$-expression preorder
iff it is a preorder (reflexive and transitive) over
$\cong$-equivalence classes of $[\![\Gamma \vdash \tau]\!]$.

Definition 2 (Quantification Type): A quantification
type is a tuple $\mathcal{Q} = (\Gamma, Q, \le, +, 0)$ of: a
typing context $\Gamma$ and a type $Q$ such that $\Gamma \vdash Q : *$; a
$(\Gamma \vdash Q)$-expression preorder $\le$; and two expressions
$+$ (sometimes written in infix notation) and $0$ of types
$Q \to Q \to Q$ and $Q$ (relative to $\Gamma$), respectively, such
that $0$ is a least element for $\le$, $+$ is monotone with
respect to $\le$, and $([\![\Gamma \vdash Q]\!], +, 0)$ forms a monoid
with respect to the equivalence relation $\simeq\ = (\le \cap \ge)$.
Stated differently, the following must hold:

Closedness: $e_1, e_2 \in [\![\Gamma \vdash Q]\!]$ implies $e_1 + e_2 \in [\![\Gamma \vdash Q]\!]$;
Monotonicity: $e_1 \le e_1'$ and $e_2 \le e_2'$ implies $e_1 + e_2 \le e_1' + e_2'$;
Leastness of 0: $0 \le e$ for all $e \in [\![\Gamma \vdash Q]\!]$;
Identity: $e + 0 \simeq e$ for all $e \in [\![\Gamma \vdash Q]\!]$;
Commutativity: $e_1 + e_2 \simeq e_2 + e_1$ for $e_i \in [\![\Gamma \vdash Q]\!]$; and
Associativity: $e_1 + (e_2 + e_3) \simeq (e_1 + e_2) + e_3$ for $e_i \in [\![\Gamma \vdash Q]\!]$.

Such a tuple is sometimes called a $Q$- or $(\Gamma \vdash Q)$-quantification
type. If $\Gamma = \cdot$ then $\mathcal{Q}$ is said to be
ground.

In the below, we often omit $\Gamma$ when it is clear from
context, writing $(Q, \le, +, 0)$. We also write $Q_{\mathcal{Q}}$,
$\le_{\mathcal{Q}}$, $+_{\mathcal{Q}}$, and $0_{\mathcal{Q}}$ for the corresponding elements of
$\mathcal{Q}$. We sometimes omit the $\mathcal{Q}$ subscript where $\mathcal{Q}$ can
be inferred from context. The following lemma gives
a final, implied constraint, that $\bot$ is always infinity:

Lemma 2: For any quantification type $\mathcal{Q}$, $e \le_{\mathcal{Q}} \bot$
for all $e \in [\![\Gamma \vdash Q]\!]$.

Example 1 (Non-Negative Reals): $\mathcal{Q}_R = (R^{+\infty}, \le_R, +_R, 0_R)$
is a ground quantification
type that corresponds to the standard ordering and
addition on the non-negative reals.

Example 2 (Non-Negative Real Functions):
For any ground $\tau$,
$(\tau \to R^{+\infty},\ \le_{\tau \to R^{+\infty}},\ \lambda x_1.\,\lambda x_2.\,\lambda y.\,x_1\,y +_R x_2\,y,\ \lambda y.\,0_R)$
is a ground quantification type, where addition is performed
pointwise on functions and $e_1 \le_{\tau \to R^{+\infty}} e_2$ iff
$e_1\,e \le_R e_2\,e$ for all ground $e$ of type $\tau$; i.e., iff the
output of $e_1$ is always no greater than that of $e_2$.
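
Examples 1 and 2 can be mirrored in a short Python sketch. This is an illustration only, under stated assumptions: `QuantType`, `Q_R`, and `pointwise` are hypothetical names, floats stand in for the non-negative reals, and the pointwise ordering is checked on a finite list of sample inputs rather than on all ground expressions.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable

INF = math.inf  # stands in for the infinite error (the role played by bottom)

@dataclass(frozen=True)
class QuantType:
    leq: Callable[[Any, Any], bool]   # preorder on errors
    add: Callable[[Any, Any], Any]    # monotone addition
    zero: Any                         # least element and identity for add

# Example 1: non-negative reals with infinity.
Q_R = QuantType(leq=lambda a, b: a <= b, add=lambda a, b: a + b, zero=0.0)

def pointwise(q, samples):
    """Example 2: functions into q, added pointwise. The ordering is only
    checked on the finite list `samples` (an assumption of this sketch)."""
    return QuantType(
        leq=lambda f, g: all(q.leq(f(x), g(x)) for x in samples),
        add=lambda f, g: (lambda x: q.add(f(x), g(x))),
        zero=lambda x: q.zero,
    )
```

Applying `pointwise` twice to `Q_R` gives a sampled analogue of the function-valued errors used later for function approximation types.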

Definition 3 (Approximate Equality): Let $\Gamma^q$ and
$\Gamma^e$ be any contexts with disjoint domains, let $\mathcal{Q}$ be
a $(\Gamma^q, \Gamma^e \vdash Q)$-quantification type, and let $E$ be
any type such that $\Gamma^e \vdash E : *$. A $(\mathcal{Q}, \Gamma^e, E)$-approximate
equality relation is a ternary relation
$e_1 \approx^q e_2 \subseteq [\![\Gamma^q, \Gamma^e \vdash Q]\!] \times [\![\Gamma^e \vdash E]\!] \times [\![\Gamma^e \vdash E]\!]$,
where $q \in [\![\Gamma^q, \Gamma^e \vdash Q]\!]$ and each $e_i \in [\![\Gamma^e \vdash E]\!]$, that
satisfies the following:

Upward Closedness: $e_1 \approx^q e_2$ and $q \le q'$ implies $e_1 \approx^{q'} e_2$;
Reflexivity: $e_1 \cong e_2$ implies $e_1 \approx^0 e_2$;
Symmetry: $e_1 \approx^q e_2$ implies $e_2 \approx^q e_1$;
Triangle Inequality: $e_1 \approx^{q_1} e_2$ and $e_2 \approx^{q_2} e_3$ implies $e_1 \approx^{q_1 + q_2} e_3$; and
Completeness: $e_1 \approx^{\bot} e_2$ for all $e_1, e_2 \in [\![\Gamma^e \vdash E]\!]$.

An approximate equality relation is ground iff
$\Gamma^q = \Gamma^e = \cdot$. In the below, we often omit $\Gamma^q$ and $\Gamma^e$
when clear from context. We use a subscript to denote
which approximate equality relation is intended, as in
$e_1 \approx^q_R e_2$, when it is not clear from context.

Example 3 (Reals): The relation $e_1 \approx^q_R e_2$ that
holds iff $q \cong \bot$, or $e_1 \cong e_2$, or $|e_1 - e_2| \le_R q$ is
a ground $(R^{+\infty}, R)$-approximate equality relation that
corresponds to the standard distance over the reals.

Example 4 (Real Functions): The relation
$e_1 \approx^q_{R \to R} e_2$ that holds iff $q \cong \bot$, or $e_1 \cong e_2$,
or $|(e_1\,r) - (e_2\,r)| \le_R (q\,r\,r^+)$ for all $r \in R$ and
$r^+ \in R^{+\infty}$, is a ground $(R \to R^{+\infty} \to R^{+\infty}, R \to R)$-approximate
equality relation, where, intuitively, the
distance between two functions $e_1$ and $e_2$ is given by
a function $q$ that bounds the distance between their
outputs for each input.
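
As a rough check of Examples 3 and 4, the relations can be written as Python predicates. This is a sketch under assumptions: the names `approx_eq_real` and `approx_eq_fun` are hypothetical, floats stand in for reals, `math.inf` stands in for the infinite error, and the quantification over all inputs in Example 4 is replaced by finite samples.

```python
import math

def approx_eq_real(q, e1, e2):
    """Example 3 on floats: e1 and e2 are within distance q of each other.
    q = math.inf relates every pair (the Completeness property)."""
    return q == math.inf or e1 == e2 or abs(e1 - e2) <= q

def approx_eq_fun(q, f1, f2, inputs, input_errors):
    """Example 4, checked only on finite samples: q(r)(r_plus) bounds the
    distance between f1(r) and f2(r) for each sampled r and r_plus."""
    return all(abs(f1(r) - f2(r)) <= q(r)(rp)
               for r in inputs for rp in input_errors)
```

For instance, `approx_eq_fun` holds for `math.sin` and the identity with the error function `lambda r: lambda rp: abs(r - math.sin(r))`, which is the function-valued distance suggested by the sin example of Section 2.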

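Definition 4 (Approximation Type): Let $\Gamma^e$, $\Gamma^a$, and
$\Gamma^q$ be three domain-disjoint contexts. A $(\Gamma^e, \Gamma^a, \Gamma^q)$-approximation
type is a tuple $(\mathcal{Q}, E, A, \approx, \{|\cdot|\}^{\cdot})$
of: a $(\Gamma^q, \Gamma^e \vdash Q)$-quantification type $\mathcal{Q}$; types $E$ and
$A$, called respectively the exact and approximate types,
such that $\Gamma^e \vdash E : *$ and $\Gamma^a \vdash A : *$; a $(\mathcal{Q}, \Gamma^e, E)$-approximate
equality relation $\approx$; and a mapping
$\{|a|\}^q : ([\![\Gamma^q, \Gamma^e \vdash Q]\!] \times [\![\Gamma^a \vdash A]\!]) \to \mathcal{P}([\![\Gamma^e \vdash E]\!])$, where
$q \in [\![\Gamma^q, \Gamma^e \vdash Q]\!]$ and $a \in [\![\Gamma^a \vdash A]\!]$, that satisfies:

Error Weakening: $q_1 \le q_2$ implies $\{|a|\}^{q_1} \subseteq \{|a|\}^{q_2}$;
Error Addition: $e_1 \in \{|a|\}^{q}$ and $e_1 \approx^{q'} e_2$ implies $e_2 \in \{|a|\}^{q + q'}$;
Equivalence: If $q \cong q'$, $a \cong a'$, and $e \cong e'$ then
$e \in \{|a|\}^{q}$ implies $e' \in \{|a'|\}^{q'}$; and
Approximate Equality: $e_1, e_2 \in \{|a|\}^{q}$ implies $e_1 \approx^{q + q} e_2$.

A ground approximation type is one where
$\Gamma^e = \Gamma^a = \Gamma^q = \cdot$. In the below, we write $\mathcal{A}$ for approximation
types, writing $\{|a|\}^q_{\mathcal{A}}$ for the set to which $\mathcal{A}$ maps $q$
and $a$, $\cdot \approx^{\cdot}_{\mathcal{A}} \cdot$ for the approximate equality relation, $E_{\mathcal{A}}$
and $A_{\mathcal{A}}$ for the exact and approximate types, $\mathcal{Q}_{\mathcal{A}}$ for
the quantification type, and $Q_{\mathcal{A}}$, $\le_{\mathcal{A}}$, and $+_{\mathcal{A}}$ for the
elements of $\mathcal{Q}_{\mathcal{A}}$. Again, we often omit the subscript
$\mathcal{A}$ when it can be inferred from context.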

Example 5 (Floating-Point Numbers): Let
$e \in \{|a|\}^q_{\mathsf{Fl}}$ iff $q \cong \bot$, or $e \cong \bot$ and $a \cong \bot$,
or $|e - \mathit{real}(a)| \le_R q$ where $\mathit{real}$ maps (the value of)
$a$ to its corresponding real number. We then have
that $\mathsf{Fl} = (\mathcal{Q}_R, R, \mathsf{Fl}, \approx_R, \{|\cdot|\}^{\cdot}_{\mathsf{Fl}})$ is a ground
approximation type where a real can be approximated
by a floating-point number with an error given by the
real-number distance between the two.
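
The membership condition of Example 5 can be tried out in Python with exact rationals standing in for reals. This is a sketch under assumptions: the names `real`, `in_fl`, and `rounding_error` are hypothetical, `fractions.Fraction` covers only rational exact values (so a value such as $\pi$ cannot be represented), and the non-terminating cases of the definition are not modeled.

```python
from fractions import Fraction

def real(a: float) -> Fraction:
    """Exact rational value of a finite float (stand-in for 'real')."""
    return Fraction(a)

def in_fl(e: Fraction, a: float, q: Fraction) -> bool:
    """e is approximated by a with error at most q: |e - real(a)| <= q."""
    return abs(e - real(a)) <= q

def rounding_error(e: Fraction) -> Fraction:
    """Smallest q for which e is related to the double obtained by float(e)."""
    return abs(e - real(float(e)))
```

For example, `rounding_error(Fraction(1, 10))` is a small positive rational, reflecting that one tenth has no exact binary floating-point representation, while `Fraction(1, 2)` is related to `0.5` with error zero.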

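Example 6 (Floating-Point Functions): Let
$e \in \{|a|\}^q_{\mathsf{Fl} \Rightarrow \mathsf{Fl}}$ iff $q \cong \bot$, or $e \cong \bot$ and $a \cong \bot$, or
$e\,e' \in \{|a\,a'|\}^{q\,e'\,q'}_{\mathsf{Fl}}$ for all $e' \in \{|a'|\}^{q'}_{\mathsf{Fl}}$. We then have
that $\mathsf{Fl} \Rightarrow \mathsf{Fl} = (R \Rightarrow R^{+\infty} \Rightarrow \mathcal{Q}_R,\ R \to R,\ \mathsf{Fl} \to \mathsf{Fl},\ \approx_{\mathsf{Fl} \Rightarrow \mathsf{Fl}},\ \{|\cdot|\}^{\cdot}_{\mathsf{Fl} \Rightarrow \mathsf{Fl}})$
is a ground approximation
type, where $R \Rightarrow R^{+\infty} \Rightarrow \mathcal{Q}_R$ is the quantification
type obtained from $\mathcal{Q}_R$ by applying the construction
of Example 2 twice. Intuitively, this approximation
type allows a real function $f$ to be approximated by
a floating-point function $f'$ with an error $q$ whenever
$q\,r\,q_r$ bounds the error between calling $f$ on exact real
number $r$ and calling $f'$ on a floating-point number
with at most distance $q_r$ from $r$.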

4.2. Approximation Families 

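Definition 5 (Approximation Families): Let
$\Gamma = \Gamma^e, \Gamma^a, \Gamma^q, \Gamma^{A}$ for four domain-disjoint typing contexts
and $\phi$ be an expression such that $\Gamma \vdash \phi : \mathsf{Prop}$.
We say that $\mathcal{F} = (E, A, Q, 0, +, F)$ is a
$(\Gamma^e, \Gamma^a, \Gamma^q, \Gamma^{A}, \phi)$-approximation family iff $\Gamma^e \vdash E : *$,
$\Gamma^a \vdash A : *$, $\Gamma^e, \Gamma^q \vdash Q : *$, $\Gamma^e, \Gamma^q \vdash 0 : Q$,
$\Gamma^e, \Gamma^q \vdash + : Q \to Q \to Q$, and $F$ is a meta-language
function from substitutions $\sigma$ such that $\sigma \models (\Gamma; \phi)$ to
ground approximation types $\mathcal{A}$ such that $E_{\mathcal{A}} = \sigma(E)$,
$A_{\mathcal{A}} = \sigma(A)$, $Q_{\mathcal{A}} = \sigma(Q)$, $0_{\mathcal{A}} = \sigma(0)$, and $+_{\mathcal{A}} = \sigma(+)$.

We use a subscript $\mathcal{F}$ to denote the elements of $\mathcal{F}$;
e.g., $E_{\mathcal{F}}$ denotes the exact type $E$ of $\mathcal{F}$. We write $\mathcal{F}(\sigma)$
for the approximation type resulting from applying the $F$
component to $\sigma$. A $(\cdot, \cdot, \cdot, \cdot, \top)$-approximation family is
called ground. Note that the ground $\mathcal{F}$ are isomorphic
to the ground approximation types $\mathcal{A}$, since, if $\mathcal{F}$ is
ground, then the domain of $F_{\mathcal{F}}$ consists of the sole
pair $(\cdot; \cdot)$.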

We define approximation contexts $\Xi$ with grammar:

$\Xi ::= \cdot \mid \Xi, (x^e, x^a, x^q) : \mathcal{F} \mid \Xi, (X^e, X^a, X^q) : \xi \mid \Xi, \Gamma.(\phi)$

The form $(x^e, x^a, x^q) : \mathcal{F}$ introduces variables $x^e$,
$x^a$, and $x^q$ such that $x^a$ is an approximation of $x^e$
with error $x^q$ in some approximation type returned by
approximation family $\mathcal{F}$. The form $(X^e, X^a, X^q) : \xi$
introduces type variables $X^e$, $X^a$, and $X^q$, along with
a variable $\xi$ that quantifies over approximations of
$X^e$ by $X^a$ with error $X^q$. Finally, the form $\Gamma.(\phi)$
introduces additional variables in $\Gamma$ and constraint $\phi$.

More formally, let $|\Xi|^e$, $|\Xi|^a$, $|\Xi|^q$, and $|\Xi|^{A}$ be
typing contexts that contain, respectively: all $x^e$ and
$X^e$; all $x^a$ and $X^a$; all $x^q$, $X^q$, and contexts $\Gamma$
in a $\Gamma.(\phi)$ form; and all $\xi$ variables in $\Xi$. Further,
let $|\Xi|^p$ be the conjunction of the following formulas:
$x^e \in \{|x^a|\}^{x^q}_{\mathcal{F}}$ for each $(x^e, x^a, x^q) : \mathcal{F}$; the
formula $\mathsf{isapprox}[X^e, X^a, X^q]\,\xi$ stating that $\xi$ is an
approximation of $X^e$ by $X^a$ with error $X^q$ for each
$(X^e, X^a, X^q) : \xi$; and $\phi$ for each $\Gamma.(\phi)$. We use
the abbreviations $|\Xi|^{eq} = |\Xi|^e, |\Xi|^q$ and
$|\Xi|^{eaqA} = |\Xi|^e, |\Xi|^a, |\Xi|^q, |\Xi|^{A}$. The approximation context $\Xi$ is
well-formed, written $\vdash \Xi$, iff $\vdash |\Xi|^{eaqA}$ and
$|\Xi|^{eaqA} \vdash |\Xi|^p : \mathsf{Prop}$. We say $\mathcal{F}$ is a $\Xi$-approximation family,
written $\Xi \vdash \mathcal{F}$, iff $\mathcal{F}$ is a $(|\Xi|^e, |\Xi|^a, |\Xi|^q, |\Xi|^{A}, |\Xi|^p)$-approximation
family. A $\sigma$ is a substitution from $\Xi$ to
$\Xi'$, written $\Xi' \vdash \sigma : \Xi$, iff
$(|\Xi'|^{eaqA}; |\Xi'|^p) \vdash \sigma : (|\Xi|^{eaqA}; |\Xi|^p)$.

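Although the functions $F$ in approximation families
return only ground approximation types, we can create non-ground
approximation types from $\mathcal{F}$ as follows. Let
$\Xi$ be any approximation context and $\mathcal{F}$ be any $\Xi$-approximation
family. The notation $\Xi \Vdash \mathcal{F}$ then denotes
the $(|\Xi|^e, |\Xi|^a, |\Xi|^q)$-approximation type
$(\mathcal{Q}, E_{\mathcal{F}}, A_{\mathcal{F}}, \approx, \{|\cdot|\}^{\cdot})$ where
$\mathcal{Q} = (|\Xi|^{eq}, Q_{\mathcal{F}}, \le, +_{\mathcal{F}}, 0_{\mathcal{F}})$ and:

- $q_1 \le q_2$ iff $\forall \sigma \models (|\Xi|^{eaqA}; |\Xi|^p).\ \sigma q_1 \le_{\mathcal{F}(\sigma)} \sigma q_2$;
- $e_1 \approx^q e_2$ iff $\forall \sigma \models (|\Xi|^{eaqA}; |\Xi|^p).\ \sigma e_1 \approx^{\sigma q}_{\mathcal{F}(\sigma)} \sigma e_2$; and
- $e \in \{|a|\}^q$ iff $\forall \sigma \models (|\Xi|^{eaqA}; |\Xi|^p).\ \sigma e \in \{|\sigma a|\}^{\sigma q}_{\mathcal{F}(\sigma)}$.

We call an approximation type formed this way a $\Xi$-approximation.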

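Theorem 1: If $\vdash \Xi$ and $\Xi \vdash \mathcal{F}$ then $\Xi \Vdash \mathcal{F}$ is a
valid approximation type.

Lemma 3 (Approximation Weakening): Let $\Xi \vdash \mathcal{F}$
and $\vdash \Xi, \Xi'$. We then have that: $\Xi, \Xi' \vdash \mathcal{F}$;
$q_1 \le_{\Xi \Vdash \mathcal{F}} q_2$ implies $q_1 \le_{\Xi, \Xi' \Vdash \mathcal{F}} q_2$;
$e_1 \approx^q_{\Xi \Vdash \mathcal{F}} e_2$ implies $e_1 \approx^q_{\Xi, \Xi' \Vdash \mathcal{F}} e_2$; and
$e \in \{|a|\}^q_{\Xi \Vdash \mathcal{F}}$ implies $e \in \{|a|\}^q_{\Xi, \Xi' \Vdash \mathcal{F}}$.

We define substitution $\sigma(\mathcal{F})$ into approximation
families as yielding the approximation family
$(\sigma E, \sigma A, \sigma Q, \sigma 0, \sigma +, \lambda \sigma'.\,F(\sigma' \circ \sigma))$.

Lemma 4 (Approximation Substitution): If $\Xi, \Xi' \vdash \mathcal{F}$
and $\Xi \vdash \sigma : \Xi'$ then: $\Xi \vdash \sigma(\mathcal{F})$;
$q_1 \le_{\Xi, \Xi' \Vdash \mathcal{F}} q_2$ implies $\sigma q_1 \le_{\Xi \Vdash \sigma(\mathcal{F})} \sigma q_2$;
$e_1 \approx^q_{\Xi, \Xi' \Vdash \mathcal{F}} e_2$ implies $\sigma e_1 \approx^{\sigma q}_{\Xi \Vdash \sigma(\mathcal{F})} \sigma e_2$; and
$e \in \{|a|\}^q_{\Xi, \Xi' \Vdash \mathcal{F}}$ implies $\sigma e \in \{|\sigma a|\}^{\sigma q}_{\Xi \Vdash \sigma(\mathcal{F})}$.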
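Definition 6 ($\Pi$-Approximations): If $\mathcal{F}$ is a $\Xi', \Xi$-approximation
family, then $\Pi(\Xi).\mathcal{F}$ is the $\Xi'$-approximation
family
$(|\Xi|^{e} \to E_{\mathcal{F}},\ |\Xi|^{a} \to A_{\mathcal{F}},\ |\Xi|^{eq} \to Q_{\mathcal{F}},\ \lambda |\Xi|^{eq}.\,0_{\mathcal{F}},\ \lambda |\Xi|^{eq}.\,+_{\mathcal{F}},\ F')$
where $F'(\sigma)$ for $\sigma \models (|\Xi'|^{eaqA}; |\Xi'|^p)$ is defined such that:
$q_1 \le_{F'(\sigma)} q_2$ iff $q_1\,|\Xi|^{eq} \le_{\Xi \Vdash \sigma(\mathcal{F})} q_2\,|\Xi|^{eq}$;
$e_1 \approx^q_{F'(\sigma)} e_2$ iff $e_1\,|\Xi|^{e} \approx^{q\,|\Xi|^{eq}}_{\Xi \Vdash \sigma(\mathcal{F})} e_2\,|\Xi|^{e}$; and
$e \in \{|a|\}^q_{F'(\sigma)}$ iff $(e\,|\Xi|^{e}) \in \{|a\,|\Xi|^{a}|\}^{q\,|\Xi|^{eq}}_{\Xi \Vdash \sigma(\mathcal{F})}$.
Intuitively, $\Pi(\Xi).\mathcal{F}$ forms
an approximation family where $\lambda |\Xi|^{e}.\,e$ is approximated
by $\lambda |\Xi|^{a}.\,a$ with error $\lambda |\Xi|^{eq}.\,q$ whenever $e$
is approximated by $a$ with error $q$ in approximation
context $\Xi$. When $\Xi$ is just the single element
$(x^e, x^a, x^q) : \mathcal{F}_1$ and $\mathcal{F}_2$ does not depend on the values
substituted for $x^e$, $x^a$, or $x^q$, then $\Pi(\Xi).\mathcal{F}_2$, which we
abbreviate as $\mathcal{F}_1 \Rightarrow \mathcal{F}_2$, yields the notion of function
approximation types discussed in Section 2 and in
Example 6.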

To approximate the polymorphic type $\forall X.\,E$ we use

$\Pi((X^e, X^a, X^q) : \xi,\ (z^0 : X^q,\ z^+ : X^q \to X^q \to X^q).(z^0 = 0_\xi \wedge z^+ = +_\xi)).\mathcal{F}$

This approximation family quantifies over the type
variables $X^e$, $X^a$, and $X^q$ for the exact, approximate,
and error types, as well as over the variables $z^0$
and $z^+$ for the zero error and error addition of the
approximation type $\xi$. The latter variables are explicitly
abstracted in order to allow error expressions to refer
to them: recall that $\xi$ is only bound in $|\Xi|^p$, not $|\Xi|^{eq}$;
i.e., error terms refer only to $X^e$ and $X^q$, not to
$\xi$. Abusing notation slightly, we abbreviate the above
approximation family as $\Pi(X, z : \xi).\mathcal{F}$.

5. Verifying an Approximating Compiler 

As discussed in the Introduction, the long-term goal 
of this work is to enable language-based approxima- 
tions, where a compiler or other tool performs ap- 
proximate transformations in a correct and automated 
manner. In this section, we show how to verify such a 
tool, the goal of the current work, with the semantics 
given in the previous section. Specifically, we consider 
a tool that performs two transformations: it compiles 
real numbers into floating-point implementations; and 
it optionally performs loop perforation lfT2l . |fT9l . For 
the current work, we assume only that our tool can be 
specified with an approximate compilation judgment 
She~^a<(j:.4 such that this judgment 
can be derived whenever the tool might approximate 
exact expression e by a with error bound by q using 
approximation type A and assumptions S. Intuitively, 
each rule of this judgment corresponds to an approxi- 
mate transformation that the tool might perform; for 
example, the approximation rule of Section |2] for 
approximating sin by Ax. x, might be included. We 
ignore the specifics of how the tool chooses which 
approximations to use where, as long as all possible 
choices are contained in the approximate compilation 
judgment. 



To verify such a tool, we then prove soundness of
its approximate compilation judgment. Soundness here
means that $\Xi \vdash e \leadsto a \le q : A$ implies
$e \in \{|a|\}^{\Xi}_{q : A}$.
This can be proved in a local, modular fashion, by
verifying each approximation rule individually; more
specifically, if an approximation rule derives
$\Xi \vdash e \leadsto a \le q : A$ from assumptions
$\Xi_i \vdash e_i \leadsto a_i \le q_i : A_i$
for $1 \le i \le n$ and side conditions $\phi_j$ for $1 \le j \le m$,
then the rule is correct iff $e \in \{|a|\}^{\Xi}_{q : A}$ holds whenever
$e_i \in \{|a_i|\}^{\Xi_i}_{q_i : A_i}$ for all $i$ and $\phi_j$ holds for all $j$.
This also allows extensibility, since additional rules can
always be added as long as they are proved correct.
In the remainder of this section, we consider rules
that would be used in our example tool, including:
compositionality rules (Section 5.1); rules for replacing
real numbers by floating-point implementations
(Section 5.2); and a rule for performing loop perforation
(Section 5.3).
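
To make the per-rule proof obligation concrete, the following Python sketch spot-checks a single rule, the approximation of sin by the identity function. The names `sin_err` and `rule_holds_on_samples`, and the bound |x^e|^3/6 + x^q, are assumptions chosen for illustration and are not taken from Section 2; random sampling is only a sanity check, not a soundness proof.

```python
import math
import random

def sin_exact(x):
    """Exact expression e (up to Python's own floating-point sin)."""
    return math.sin(x)

def sin_approx(x):
    """Approximate expression a: the identity function."""
    return x

def sin_err(x_e, x_q):
    """Hypothetical error function: |sin(x_e) - x_a| <= |x_e|^3/6 + x_q
    whenever |x_e - x_a| <= x_q (Taylor remainder plus triangle inequality)."""
    return abs(x_e) ** 3 / 6 + x_q

def rule_holds_on_samples(trials=10_000, seed=0, slack=1e-12):
    """Spot-check the rule on random inputs; slack absorbs rounding in math.sin."""
    rng = random.Random(seed)
    for _ in range(trials):
        x_e = rng.uniform(-1.0, 1.0)        # exact input
        x_q = rng.uniform(0.0, 0.1)         # allowed input error
        x_a = x_e + rng.uniform(-x_q, x_q)  # some approximate input within x_q
        if abs(sin_exact(x_e) - sin_approx(x_a)) > sin_err(x_e, x_q) + slack:
            return False
    return True
```

Note that the error is itself a function of the exact input and the input error, which is the shape of error the semantics assigns to approximations of functions.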



5.1. Compositionality Rules 

In order to combine errors from individual approximate
transforms into a single, whole-program error, we
now introduce the compositionality rules. These rules,
given in Figure 2, are essentially the identity, stating
that each expression construct can be approximated by
itself; however, they show how to build up and combine
error expressions for different constructs.

The first rule, A-Weak, allows the error bound
to be weakened from $q$ to any greater error $q'$. The
A-Var rule approximates variable $x^e$ by $x^a$ with error
$x^q$ when these variables are associated in $\Xi$. The
rule A-Lam approximates a lambda-abstraction $\lambda x^e.\, e$
with a lambda-abstraction $\lambda x^a.\, a$ by approximating
the body $e$ by $a$, using the error function $\lambda x^e.\, \lambda x^q.\, q$
that abstracts over the input $x^e$ and its approximation
error $x^q$. The A-App rule approximates applications
$e_1\, e_2$ by applying the approximation of $e_1$ to that
of $e_2$ and applying the error for $e_1$ to both $e_2$ and
its error. The A-TLam rule approximates polymorphic
lambdas $\Lambda X^e.\, e$ by approximating the body $e$
in the extended approximation context $\Xi, X, z : \zeta$,
recalling the abbreviation $X, z : \zeta$ from Section 4.2
that abstracts the various components of approximation
families. A-TApp approximates type applications
$e[E_{\mathcal{F}}]$ where the type involved is the exact type of
some approximation family $\mathcal{F}$. This is accomplished
by first approximating $e$ to some $a$ with error $q$ in the
polymorphic approximation $\Pi(X, z : \zeta).\mathcal{F}'$ introduced
in Definition 6 and then applying $a$ to the approximate
type $A_{\mathcal{F}}$ of $\mathcal{F}$. The error $q$ is applied to the necessary
components of $\mathcal{F}$, and the resulting approximation is
$\mathcal{F}'$ with $\mathcal{F}$ substituted for $\zeta$ and all the appropriate
components of $\mathcal{F}$ substituted for the $X$ and $z$ variables.
This is abbreviated as $[\mathcal{F}/\zeta, \ldots]\mathcal{F}'$.

Figure 2. Compositionality Rules for Approximate Compilation
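
The way A-Lam and A-App thread errors through an application can be sketched in Python as follows. This is a minimal illustration restricted to numeric data; the classes `ValueApprox` and `FunApprox`, the helper `a_app`, and the two example rules are hypothetical names introduced here, not part of the formal system.

```python
from dataclasses import dataclass
from typing import Callable
import math

@dataclass
class ValueApprox:
    exact: float   # e
    approx: float  # a
    err: float     # q, assumed to satisfy |exact - approx| <= err

@dataclass
class FunApprox:
    exact: Callable[[float], float]       # exact function
    approx: Callable[[float], float]      # approximate function
    err: Callable[[float, float], float]  # error function: (x_e, x_q) -> bound

def a_app(f: FunApprox, v: ValueApprox) -> ValueApprox:
    """Application in the style of A-App: exact applied to exact, approximate
    applied to approximate, and the function's error applied to the exact
    argument and the argument's error."""
    return ValueApprox(f.exact(v.exact), f.approx(v.approx), f.err(v.exact, v.err))

# Hypothetical rules: sin approximated by the identity, and doubling by itself.
sin_rule = FunApprox(math.sin, lambda x: x,
                     lambda x_e, x_q: abs(x_e) ** 3 / 6 + x_q)
double_rule = FunApprox(lambda x: 2 * x, lambda x: 2 * x,
                        lambda x_e, x_q: 2 * x_q)

x = ValueApprox(exact=0.3, approx=0.31, err=0.02)
result = a_app(double_rule, a_app(sin_rule, x))  # approximates 2 * sin(x)
```

Here `result.err` is obtained purely by composing the error functions of the two rules, without re-analysing the composed program, which is the modularity the compositionality rules are meant to provide.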

Fixed-points fix(e) are approximated using A-Fix.
This approximates $e$ to $a$ with error $q$, applying fix
to the results in the conclusion. The approximation
family used for $e$ is $\mathcal{F} \Rightarrow \mathcal{F}$, augmented with the
assumption that the inputs $x^e$, $x^a$, and $x^q$ are equal
to the exact, approximate, and error fix-expressions in
the conclusion of the rule.

Finally, if-expressions are approximated with A-If.
First, each component of the if-expression is approximated;
the condition can use an arbitrary approximation
family $\mathcal{F}'$, while the then and else branches
must use the same family as the whole expression. The
final condition requires that the output error q bounds 
the error between the branches taken in the exact and 
approximate expressions, even if one takes the then 
branch and the other takes the else branch. This is 
stated by quantifying over all four combinations of 
True or False in the exact and approximate conditions 
that meet the error q' computed for approximating the 
if-condition, and then requiring that the corresponding 
then or else branches are within error q. 
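
The side condition of the if rule can be illustrated with a small Python sketch for numeric branches. The function `if_error` and the representation of admissible condition outcomes as a list of Boolean pairs are assumptions made for this example; how Boolean errors are actually represented is not specified here.

```python
def if_error(cond_combos, then_e, else_e, then_a, else_a):
    """Smallest numeric q bounding the distance between the branch taken by
    the exact program and the branch taken by the approximate program, over
    all admissible (exact condition, approximate condition) combinations."""
    worst = 0.0
    for c_e, c_a in cond_combos:
        out_e = then_e if c_e else else_e   # branch taken by the exact program
        out_a = then_a if c_a else else_a   # branch taken by the approximation
        worst = max(worst, abs(out_e - out_a))
    return worst

# Hypothetical instance: abs(x) written as "if x > 0 then x else -x",
# with an exact input and an approximate input on opposite sides of zero.
x_e, x_a = 0.05, -0.02
ALL_COMBOS = [(True, True), (True, False), (False, True), (False, False)]

# If the condition error admits all four combinations, q must cover them all.
q_all = if_error(ALL_COMBOS, x_e, -x_e, x_a, -x_a)

# If only the combination that actually occurs is admissible, q can be smaller.
q_actual = if_error([(x_e > 0, x_a > 0)], x_e, -x_e, x_a, -x_a)
```

A tighter error for the condition rules out more combinations and therefore permits a smaller output error, as the comparison between `q_all` and `q_actual` suggests.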

Lemma 5: Each rule of Figure 2 is sound.

5.2. Floating-Point Approximation 

To approximate real-number programs with their
floating-point equivalents, we use the Fl approximation
type formalized in Example 5. We then add rules

\[ \Xi \vdash r \leadsto \mathrm{real}(r) \le |r - \mathrm{real}(r)| : \mathrm{Fl} \]

\[ \Xi \vdash op^{\mathbb{R}} \leadsto op^{\mathrm{float}} \le op^{q} : \mathrm{Fl} \Rightarrow \mathrm{Fl} \Rightarrow \mathrm{Fl} \]

for each real number $r$ and each built-in binary operation
(such as +, *, etc.) $op$, where $op^{\mathbb{R}}$ and $op^{\mathrm{float}}$
are the versions of $op$ for reals and floating-points,
respectively, and $op^{q}$ is the error function calculating
the size of the interval error in the output from those
of the inputs. The error functions $op^{q}$ to use for the
various operations can be derived in a straightforward
manner by considering the smallest error that bounds
the difference between the real result of $op^{\mathbb{R}}$ and any
potential result of $op^{\mathrm{float}}$ on floating-point numbers
in the input interval. The infinite error $\bot$ is returned
in case of overflow or Not-A-Number results. For
instance, the error $+^{q}$ can be defined (in pseudocode) as:

    +^q x^e x^q y^e y^q =
      let I = round([(x^e - x^q) + (y^e - y^q),
                     (x^e + x^q) + (y^e + y^q)]) in
      if ∃r ∈ I. |r| > MAXFLOAT then ⊥ else
      max_{r' ∈ I} |r' - r|

where round rounds all reals in an interval to floating- 
point numbers (using the current rounding mode) and 
MAXFLOAT is the maximum absolute value of the 
floating-point representation being used. The errors for 
other operations can be defined similarly. 
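
The pseudocode above can be approximated by the following runnable Python sketch. It rests on several assumptions that are not in the original: exact reals are modelled as `Fraction` values, the floating-point format is IEEE double, rounding is round-to-nearest (the behaviour of `float(Fraction)`), the infinite error is represented by `math.inf`, and the comparison value in the final maximum is taken to be the exact sum x^e + y^e. The names `plus_err` and `round_to_float` are hypothetical.

```python
from fractions import Fraction
import math
import sys

MAXFLOAT = Fraction(sys.float_info.max)

def round_to_float(r: Fraction) -> float:
    """Round an exact rational to the nearest double."""
    return float(r)

def plus_err(x_e: Fraction, x_q: Fraction, y_e: Fraction, y_q: Fraction):
    """Interval-style error bound for floating-point addition."""
    lo = (x_e - x_q) + (y_e - y_q)   # smallest possible real sum of the inputs
    hi = (x_e + x_q) + (y_e + y_q)   # largest possible real sum of the inputs
    if abs(lo) > MAXFLOAT or abs(hi) > MAXFLOAT:
        return math.inf              # conservative overflow check
    exact = x_e + y_e
    # Rounding is monotone, so the worst case occurs at an interval endpoint.
    return max(abs(Fraction(round_to_float(lo)) - exact),
               abs(Fraction(round_to_float(hi)) - exact))
```

For example, taking x^e = 1/10 and y^e = 2/10 with the input errors set to the representation errors of the doubles `0.1` and `0.2`, the returned bound covers the distance between the double result `0.1 + 0.2` and the exact sum 3/10.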

Using these rules with the compositionality rules of
Figure 2 yields an interval-based floating-point error
analysis that works for higher-order and even polymorphic
terms. Although recent work has given more
precise floating-point error analyses than intervals [10],
[5], we anticipate, as future work, that such approaches
can also be incorporated into our framework, allowing
them to be used on higher-order, polymorphic programs.

Theorem 2: The rules listed above for compiling 
reals to floating-points are sound. 



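\[
\frac{
\begin{array}{c}
\Xi \vdash e_1 \leadsto a_1 \le q_1 : \mathcal{F} \Rightarrow \mathcal{F} \Rightarrow \mathcal{F} \\
\Xi \vdash e_2 \leadsto a_2 \le 0 : \mathrm{Nat} \\
\Xi \vdash e_3 \leadsto a_3 \le q_3 : \mathrm{Nat} \Rightarrow \mathcal{F}
\qquad e_3\, x \approx^{q\, x}_{\mathcal{F}} e_3\, \lfloor x \rfloor_K \\
\text{red-seq}\ e_1\ e_2\ e_3 \approx^{q'}_{\mathcal{F}} \text{red-seq}\ e_1\ \lfloor e_2 \rfloor_K\ e_3
\end{array}
}{
\begin{array}{l}
\Xi \vdash \text{red-seq}\ e_1\ e_2\ e_3 \\
\quad \leadsto \text{red-seq}\ (\lambda x.\, (a_1\, x)^K)\ (a_2 / K)\ (\lambda x.\, a_3\, (x * K)) \\
\quad \le (\text{red-seq}\ {+_{\mathcal{F}}}\ e_2\ (\lambda x.\, (q_3\, \lfloor x \rfloor_K\, 0) +_{\mathcal{F}} (q\, x))) +_{\mathcal{F}} q' : \mathcal{F}
\end{array}
}
\]

Figure 3. Loop Perforation

5.3. Loop Perforation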

Loop perforation [12], [19] is a powerful approximate
transformation that takes a loop which combines
$f(0)$, $f(1)$, etc., for some computationally intensive
function $f$, and only performs every $n$th iteration,
repeating the values that are computed $n$ times. We
formalize a simplified version of loop perforation as
operating on reductions red-seq $e_1$ $e_2$ $e_3$, defined as
the function that reduces, using $e_1$, the sequence of
values of $e_3$ on inputs less than or equal to $e_2$. The
rule in Figure 3 shows how to perforate this reduction
by first approximating each $e_i$ to some $a_i$ with error
$q_i$. The rule requires that the number $e_2$ of sequence
elements can be approximated with error 0; this restriction is
not a fundamental limitation, but is simply made here
to simplify the exposition. The approximation type
Nat is defined similarly to Fl, except that the natural
numbers are used for the exact, approximate, and error
types. Next, the rule finds an error function $q$ that,
when applied to input $x$, bounds the error between
the exact sequence value $e_3\, x$ and the $x$th value in
the perforated sequence, which can be calculated as
$e_3\, \lfloor x \rfloor_K$. Finally, the rule considers the fact that $e_2$
may not be an exact multiple of $K$, in which case the
perforated loop actually computes the result of $\lfloor e_2 \rfloor_K$
iterations. An error $q'$ is thus synthesized to bound the
error between this perforated result and the original
reduction. The resulting approximation then performs
the perforation computation as described above, and
the error sums the errors $(q_3\, \lfloor x \rfloor_K\, 0)$, capturing the
approximation error of $a_3$, with $(q\, x)$, capturing the
error of using $\lfloor x \rfloor_K$ instead of $x$, for each value $x$,
and then adds the error $q'$.
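
The perforation described above can be sketched in Python for the special case of numeric summation. The functions `red_seq`, `perforated`, and `perforation_bound` are hypothetical; the sketch assumes that the sequence is indexed from 0 to n - 1, that the sequence function is itself computed exactly (so the q_3 term vanishes), and that errors are plain non-negative numbers. The bound is computed by brute force purely to illustrate its structure; a real tool would synthesize it symbolically.

```python
def red_seq(op, n, f, init=0):
    """Reduce f(0), ..., f(n - 1) with op (indexing convention assumed here)."""
    acc = init
    for x in range(n):
        acc = op(acc, f(x))
    return acc

def perforated(op, n, f, K, init=0):
    """Evaluate f only at multiples of K and reuse each value K times."""
    def op_K_times(acc, v):
        for _ in range(K):
            acc = op(acc, v)
        return acc
    return red_seq(op_K_times, n // K, lambda x: f(x * K), init)

def perforation_bound(n, f, K):
    """Error bound for summation: per-element errors plus the leftover term."""
    floor_K = lambda x: (x // K) * K      # x rounded down to a multiple of K
    add = lambda a, b: a + b
    # Plays the role of summing (q x) over all elements.
    per_element = red_seq(add, n, lambda x: abs(f(x) - f(floor_K(x))))
    # Plays the role of q': the iterations dropped when n is not a multiple of K.
    leftover = abs(red_seq(add, n, f) - red_seq(add, floor_K(n), f))
    return per_element + leftover
```

For instance, summing f(x) = x over 11 elements with K = 2 gives 55 exactly and 40 after perforation, and the bound evaluates to 5 + 10 = 15.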

Theorem 3: The rule of Figure 3 is sound.

6. Related Work 

A number of recent papers relate to program approximations.
Possibly the closest to this work is the
work of Reed and Pierce [16], because they consider a
higher-order (though not polymorphic) input language.
They show how to perform an approximate program
transformation that adds noise to a database query to
ensure differential privacy, i.e., that a query cannot
violate the privacy of a single individual recorded in
the database. In order to ensure that adding noise does
not change the query results too much, a type system
is used to capture functions that are $K$-Lipschitz,
meaning that a change of $\delta$ in the input yields a change
of at most $K * \delta$ in the output. This condition can be
captured in our system by the error $\lambda x^e.\, \lambda x^q.\, K * x^q$ in
an approximation of a function.
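
As a small illustration of this encoding, the Python sketch below builds the error function for a K-Lipschitz function and checks it on one input pair. The helper names `lipschitz_error` and `within_bound` are hypothetical, and the check covers only the case where a function is approximated by itself on a perturbed input.

```python
def lipschitz_error(K):
    """Error function taking the exact input x_e and the input error x_q,
    and returning the output error bound K * x_q."""
    return lambda x_e, x_q: K * x_q

def within_bound(f, err, x_e, x_a):
    """Check that f(x_a) is within err of f(x_e), using |x_e - x_a| as x_q."""
    x_q = abs(x_e - x_a)
    return abs(f(x_e) - f(x_a)) <= err(x_e, x_q)

# Hypothetical example: f(x) = 3x + 1 is 3-Lipschitz but not 2-Lipschitz.
f = lambda x: 3 * x + 1
ok_with_3 = within_bound(f, lipschitz_error(3), 1.0, 1.25)
ok_with_2 = within_bound(f, lipschitz_error(2), 1.0, 1.25)
```

The error function ignores its first argument, which reflects that a Lipschitz bound depends only on the size of the input perturbation and not on where the input lies.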

Loop perforation [12], [19] transforms certain map-reduce
programs, written as for-loops in C, to perform
only a subset of their iterations. Section 5.3
shows how to capture a variant of this transformation
in our system. More recent work [21] has extended
loop perforation to sampling, which takes a random
(instead of a controlled) sample of the iterations of a
loop. This work also adds substitution transformations,
where different implementations of the same basic
operations (such as sin or log) are substituted to try
to trade off accuracy for performance.

EnerJ [18] allows the programmer to specify, with
type modifiers, whether data is exact or inexact. Inexact 
data means the program can tolerate errors in this data. 
Such data is then stored in low-power memory, which 
is cheaper but is susceptible to random bit flips. 

Carbin et al. [2] present a programming model 
with relaxation, where certain values in a program are 
allowed to vary in a pre-defined way, to model errors. 
The authors then show how to verify properties of 
relaxed programs despite these errors. Although this 
work addresses some of the same questions as ours, 
one significant drawback is that it cannot represent 
the relationship between expressions of different types, 
such as real-valued functions and functions on floating- 
point numbers used to implement them. 

Approximate bisimulation [9] is a new technique for
relating discrete, continuous and hybrid systems in a 
manner that can tolerate errors in a system. We are 
still investigating the relationships between our system 
and approximate bisimulation, but one key difference 
with our work is that this approach considers only 
transition systems, and so cannot be applied to higher- 
order programs, programs with recursive data, etc.; 
however, such systems still encompass the large and 
practical class of control systems. 

Chaudhuri et al. f?!, f3l have investigated a static 
analysis for proving robustness of programs, which 
they then argue is a useful precondition for approximate
transformations. They define robustness as the
$K$-Lipschitz condition, which, again, can be captured
in our system as the error $\lambda x^e.\, \lambda x^q.\, K * x^q$.

Floating-point error analysis [10], [5] bounds the
error between a floating-point program and the syntactically
equivalent real-number program. We showed
how to accomplish this in our setting in Section 5.2,
yielding what we believe is the first floating-point error
analysis to work for higher-order and polymorphic
programs. Although state-of-the-art analyses use affine
expressions, instead of intervals as we used above, we
anticipate that we will be able to accommodate these
approaches in our semantics as well, thereby adapting
them to higher-order and polymorphic programs.

On a technical note, our quantification types and 
approximate equality relations were inspired by Flagg 
and Kopperman's continuity spaces (HI, one of the few 
works that considers a more general notion of distance 
than metric spaces. Specifically, the distributive lattice 
completion of a quantification type corresponds exactly 
to Flagg and Kopperman's abstract notion of distance, 
called a quantale. A similar transformation turns an 
approximate equality relation into a continuity space, 
their generalization of metric spaces. 

7. Conclusion 

We have introduced a semantics for approximate 
program transformations. The semantics relates an 
exact program to an approximation of it, and quantifies 
this relationship with an error expression. Rather than 
specifying errors solely with real numbers and metric 
spaces, our approach is based on approximation types, 
an extension of logical relations that allows us to use, 
for example, functions as the errors for approximations 
of functions, and polymorphic types as the errors for 
polymorphic types. We then show how approximation 
types can be used to verify approximate transforms in 
a modular, composable fashion, by proving soundness 
of each transform individually and by including a set of 
compositionality rules, also proved correct, that com- 
bine errors from individual approximate transforms 
into a whole-program error. 